This project introduces a machine learning approach for rapid COVID-19 detection from chest CT scans, motivated by delays in RT-PCR testing. CT images are contrast-enhanced and reduced in dimensionality, and then KNN, Decision Tree, SVM, and Random Forest classifiers are evaluated on them. The three best performers (Random Forest, KNN, and SVM) are combined into a soft voting ensemble to improve accuracy. A Streamlit web app lets users upload a CT image and receive an immediate COVID vs Non-COVID prediction, providing a scalable diagnostic aid.
Introduction
The COVID-19 pandemic highlighted the need for rapid, accurate diagnosis to control virus spread. While RT-PCR is the standard test, it has limitations in sensitivity and speed. Medical imaging, especially CT scans, offers a valuable alternative due to higher sensitivity in detecting lung abnormalities related to COVID-19. However, interpreting CT scans requires expert radiologists, creating a bottleneck.
To address this, the project proposes an automated machine learning system that classifies CT scan images as COVID or Non-COVID, aiming to support healthcare professionals with a fast, reliable, and scalable diagnostic tool.
Objectives and Approach:
Enhance CT images and reduce feature dimensions using Principal Component Analysis (PCA).
Train multiple classifiers (Random Forest, KNN, SVM, etc.) and combine the best three via a soft voting ensemble to improve accuracy.
Deploy the model in a Streamlit web application enabling real-time diagnosis through user-friendly CT image upload.
Literature Review:
Prior studies have applied traditional machine learning and deep learning techniques to chest X-rays and CT images, reporting accuracies from 75% to over 99% using CNNs, ensemble methods, and feature selection.
Challenges include small datasets, lack of standardization, and limited clinical deployment.
Methodology:
Preprocess images by converting to grayscale, resizing, normalizing, and enhancing contrast.
Extract features and apply PCA, retaining 90% of the variance for efficient modeling.
Train and evaluate multiple machine learning models individually.
Use a soft voting ensemble to combine the top-performing models (Random Forest, KNN, SVM).
Deploy the saved ensemble model on a Streamlit app for instant COVID/Non-COVID classification.
Implementation:
Developed in Python using libraries such as scikit-learn, OpenCV, and Streamlit.
The ensemble averages the class probabilities predicted by the individual classifiers and selects the class with the highest average, which makes the final decision more robust.
Web app enables easy CT scan uploads and instant diagnostic feedback.
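To make the probability averaging concrete, here is a hedged sketch using scikit-learn's `VotingClassifier`. The synthetic features from `make_classification` stand in for PCA-reduced CT scans, and the hyperparameters (`n_estimators=100`, `n_neighbors=5`, default SVC kernel) are illustrative assumptions rather than the project's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic two-class features standing in for PCA-reduced CT scans.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        # probability=True is required so SVC exposes predict_proba.
        ("svm", SVC(probability=True, random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

# Soft voting by hand: average each member's class probabilities.
member_probs = np.stack([m.predict_proba(X_test) for m in ensemble.estimators_])
manual_avg = member_probs.mean(axis=0)

# The ensemble's own probabilities should match the manual average.
ensemble_probs = ensemble.predict_proba(X_test)
labels = ensemble.classes_[ensemble_probs.argmax(axis=1)]
print("accuracy on synthetic data:", ensemble.score(X_test, y_test))
```

Unlike hard (majority) voting, soft voting lets a confident classifier outweigh two uncertain ones, which is one reason it is often preferred when all members produce usable probability estimates.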
Results:
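The soft voting ensemble reached 99.84% accuracy with a near-perfect confusion matrix. Deployed in the Streamlit app, it provides a practical, real-time tool for COVID-19 detection from CT scans that can speed up diagnosis and assist medical professionals.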
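A classifier of this kind is typically evaluated on held-out scans using accuracy and a confusion matrix. The sketch below shows how those metrics are computed with scikit-learn. The labels are toy values chosen for illustration (assuming 1 = COVID and 0 = Non-COVID) and are not the project's results.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy labels for illustration only (1 = COVID, 0 = Non-COVID).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

accuracy = accuracy_score(y_true, y_pred)
sensitivity = tp / (tp + fn)  # share of COVID cases correctly flagged
specificity = tn / (tn + fp)  # share of Non-COVID cases correctly cleared

print(cm)
print(accuracy, sensitivity, specificity)
```

For a diagnostic aid, sensitivity and specificity are worth reporting alongside accuracy, since a missed COVID case and a false alarm carry different costs.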
Conclusion
This project presents a machine learning system for detecting COVID-19 from chest CT scan images that achieves 99.84% accuracy. It combines image preprocessing with histogram equalization, PCA-based dimensionality reduction, and three classifiers (KNN, SVM, and Random Forest) joined in a soft voting ensemble. The system is deployed through a Streamlit web interface that classifies uploaded images in real time. Evaluation results, including a near-perfect confusion matrix, indicate that the model is reliable on the data used. The approach offers a fast, automated alternative to RT-PCR testing, and future work could extend it with larger datasets and advanced deep learning techniques.
References
[1] Videla, Lakshmi Sarvani, et al. "Convolution Neural Networks based COVID-19 Detection using X-ray Images of Human Chest." 2022 8th International Conference on Smart Structures and Systems (ICSSS). IEEE, 2022.
[2] Sayed, Safynaz Abdel-Fattah, Abeer Mohamed Elkorany, and Sabah Sayed Mohammad. "Applying different machine learning techniques for prediction of COVID-19 severity." IEEE Access 9 (2021): 135697-135707.
[3] Tang, Shanjiang, et al. "EDL-COVID: Ensemble deep learning for COVID-19 case detection from chest X-ray images." IEEE Transactions on Industrial Informatics 17.9 (2021): 6539-6549.
[4] Haq, Amin Ul, et al. "Deep Learning Approach for COVID-19 Identification." 2021 18th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). IEEE, 2021.
[5] Irmak, Emrah. "A novel deep convolutional neural network model for COVID-19 disease detection." 2020 Medical Technologies Congress (TIPTEKNO). IEEE, 2020.